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Abstract 

j> Nonparametric estimation of the gap time distribution in a simple re- 

CN newal process may be considered a problem in survival analysis under 

particular sampling frames corresponding to how the renewal process 
is observed. This note describes several such situations where simple 
product limit estimators, though inefficient, may still be useful. 

o 
o 

T! 1 Introduction 

> 

^ This note is about two classical problems in nonparametric statistical analysis 

^ of recurrent event data, both formalised within the framework of a simple, 

stationary renewal process. 

We first consider observation around a fixed time point, i.e., we observe 
a backward recurrence time R and a forward recurrence time S. It is well 
known that the nonparametric maximum likelihood estimator of the gap- 
time distribution is the Cox-Vardi estimator (Cox 1969, Vardi 1985) derived 
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from the length-biased distribution of the gap time R + S. However, Winter 
& Foldes (1988) proposed to use a product-Umit estimator based on S, with 
delayed entry given by R. Keiding & Gill (1988) clarified the relation of 
that estimator to the standard left truncation problem. Unfortunately this 
discussion was omitted from the published version (Keiding & Gill, 1990). 
Since these simple relationships do not seem to be on record elsewhere, we 
offer them here. 

The second observation scheme considers a stationary renewal process ob- 
served in a finite interval where the left endpoint does not necessarily corre- 
spond to an event. The full likelihood function is complicated, and we briefly 
survey possibilities for restricting attention to various partial likelihoods, in 
the nonparametric case again allowing the use of simple product-limit esti- 
mators. 

2 Observation of a stationary renewal 
process around a fixed point 

Winter & Foldes (1988) studied the following estimation problem. Consider 
n independent renewal processes in equilibrium with underlying distribution 
function F, which we shall assume absolutely continuous with density /, 
minimal support interval (0,oo), and hazard /3{t) — f{t)/{l — F{t)), t > 0. 
The reason for our unconventional choice /3 for the hazard rate belonging 
to F will become apparent later. Corresponding to a fixed time, say 0, the 
backward and forward recurrence times Ri and Si, i = I, ...,n, are observed; 
their sums Qi — Ri + Si are length- biased observations from F, i.e., their den- 
sity is proportional to tf{t). Let (i?, S, Q) denote a generic triple {Ri, Si, Qi). 
We quote the following distribution results: let // be the expectation value 
corresponding to the the distribution F, 



then the joint distribution of R and S has density f{r + s)/n , the marginal 
distributions of R and S are equal with density (1— F(r)) / /i, and the marginal 
distribution oi Q = R + S has density qf{q) / jJ., the length-biased density 
corresponding to /. 

Winter and Foldes considered the product-limit estimator 





2 



where 

n 

Y{t) = Y,HRi<t<Ri + Si} 
1=1 

is the number at risk at time t. This estimator is the same as the Kaplan- 
Meier estimator for iid survival data Qi, . . . , Qn left-truncated at . . . , Rn 
(Kaplan & Meier 1958, Andersen ct al. 1993). Winter & Foldes showed that 
1 — F is strongly consistent for the underlying survival function 1 — F . 

We shall show how the derivation of this estimator follows from a simple 
Markov process model similar to the one used by Keiding & Gill (1990) to 
study the random truncation model. 

First notice that the conditional distribution oi Q — R + S given that 
R = r has density 

, r < g < oo 

(1 - F{r))/n 

that is, intensity (hazard) f{q)/{l — F(q)), q > r, which is just the hazard 
j3{q) corresponding to the underlying distribution F left-truncated at r. Now 
define corresponding to (i?, Q) a stochastic process U on [0, oo] with state 
space {0, 1, 2} by 

t < R, 

U{t) = <( 1, R < t < R + S, 

R + S < t. 

Note that it takes in successsion the values 0, 1 and 2. For U (t) = 0, 

P{U(t + /i) = 1 I U(u), 0<u<t} 
= P{R <t + h\R>t} 
— a{h)h + o{h), 

where a is the hazard rate of the marginal distribution of R. For U{t) = 1 
(and hence R <t) 

P{U{t + h) = 2\ U{u), 0<u< t} 

= P{R + S<t + h\ R = r<t,R + S>t} 

by the above result on the conditional hazard of R+S given R. For U (t) — 0, 

P{U{t + h)=2\U{u),0<u<t} = o{h). 
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Figure 1: Inhomogenous 3-state Markov process, 2 allowed transitions 



Other transitions are impossible. 

That these conditional probabilities depend on U (t) and t only, but not 
on U{u), u < t, proves that U is a. Markov process with intensities 

a{t) 



/~(1-F(r))dr 

(the marginal hazard of R, equal to the residual mean lifetime function of 
the underlying distribution F) and 



l-F(tY 

see Figure 1. The Markov process framework of Keiding & Gill (1990) now 
indicates that (ignoring information about F in a, and just focussing on 
the transition with rate /3) the product limit estimator 1 — F is a natural 
estimator of the survivor function 1 — F of interest, and consistency and 
asymptotic normality may be obtained as shown by Keiding & Gill (1990, 
Sec. 5). 

Note that the backwards intensity 

^'p{u{t)^i} 

P{R>t} 



= a{t) 



a{t) 



P{R <t<R + S] 
„-'J^{l-F{r))dr 

_ 1-F{t) X°°(l-F(r))dr _ 1 
" ^(1 - F(r))dr J^(l - F{t))dr ~ 

the backwards hazard-rate of a uniform distribution on a bounded interval 
(0, ^4), ^4 < oo. Since it has been assumed that R has support interval (0, oo), 
this shows that the present model may not be interpreted strictly as a left 
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truncation model, which would require that a{t) was the backwards hazard 
rate of some probability distribution on (0, oo). However, this distinction is 
not important to our discussion. 

The fact that a{t) does not depend on F corresponds to Winter and 
Foldes' statement that {R, S) contains no more information than R+S about 
F. This already follows from sufficiency since the joint density of {R, S) is 
f{r-\-s)/ii. The hkehhood function based on observation of Si), . . . {R^ Sn) 
is 



i=l 



from which the NPMLE of F is readily derived as 

P ^ ^ I{R. + S.<t} 1 

R. + S, I ^^Ri + S,' 

that is the Cox-Vardi estimator in the terminology of Winter and Foldes 
(Cox 1969, Vardi 1985). 

It follows that the estimator 1 — F is not NPMLE. The important dif- 
ference between the situation here and that of the random truncation model 
studied by Keiding & Gill (1990, Sec. 3) is that not only the intensity /3(t), 
but also a{t) depends only on the estimand F. 

As already mentioned, weak convergence of 1 — F is immediate from 
Keiding & Gill (1990, Sec. 5). In particular, in order to achieve the extension 
to convergence on [0, M] it should be required that 

/ d$(s)/i/2(s) < oo 

in the terminology of Keiding & Gill (1990, Sec. 5c), and using d$(t) = /3{t)dt 
and 



the integrability condition translates into 

Pit) 



I 



P{U{t) = l} 



dt < oo 



or finiteness of E{l/X) where X has the underlying ("length-unbiased") 
interarrival time distribution F. It may easily be seen from Gill et al. (1988) 
that the same condition is needed to ensure weak convergence of the Gox- 
Vardi estimator. 
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A variation of the observation scheme of this section would be to al- 
low also right censoring of the S^. This can be immediately included in the 
Markov-process/counting process approach leading to the inefficient product- 
limit type estimator 1 — F; the delayed-entry observations Si are simulta- 
neously right-censored. See Vardi (1985, 1989) and Asgharian et al. (2002) 
for treatment of the full non-parametric maximum likelihood estimator of F, 
extending the Cox- Vardi estimator to allow right censoring. 

Other ad hoc estimators and the rich relationships with a number of other 
important non-parametric estimation problems are discussed by Denby and 
Vardi (1985) and Vardi (1989). 

3 Observation of a stationary renewal 
process in a finite interval 

We consider again a stationary renewal process on the whole line and as- 
sume that we observe it in some interval [ti,t2\ determined independently 
of the process. Nonparamctric estimation of the gap time distribution F 
was definitively discussed by Vardi (1982) in discrete time and by Soon & 
Woodroofc (1996) in continuous time. Cook & Lawless (2007, Chapter 4) 
surveyed the general area of analysis of gap times emphasizing that the as- 
sumption of independent gap times is often unrealistic. 

We shall here nevertheless work under the assumption of the simplest 
possible model as indicated above. Because the nonparamctric maximum 
likelihood estimator is computationally involved it may sometimes be useful 
to calculate less efficient alternatives, and there are indeed such possibilities. 

Under the observation scheme indicated above we may have the following 
four types of elementary observations 

1. Times Xi from one renewal to the next, contributing the density f{xi) to 
the likelihood. 

2. Times from one renewal T to t2, which are right-censored versions of 1., 
contributing factors of the form (1 — F{t2 — T)) to the likelihood. 

3. Times from ti to the first renewal T (forward recurrence times), contribut- 
ing factors of the form (1 — F(T — ii))/yU to the hkehhood. 

4. Knowledge that no renewal happened in [ti,t2] , actiially a right-censored 
version of 3., contributing factors of the form J^_^^{1 — F{u))(iu/ii to the 
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likelihood. 

McClean & Dcvine (1995) studied nonparametric maximum likelihood 
estimation in the conditional distribution given that there is at least one 
renewal in the interval, i.e., that there are no observations of type 4. 

Our interest is in basing the estimation only on complete or right-censored 
gap times, i.e., observations of type 1 or 2. When this is possible, we have 
simple product-limit estimators in the one-sample situation, and we may 
use well-established regression models (such as Cox regression) to account 
for covariates. Pena et al. (2001) assumed that observation started at a 
renewal (thereby defining away observations of type 3 and 4) and gave a 
comprehensive discussion of exact and asymptotic properties of product-limit 
estimators with comparisons to alternatives, building in particular on results 
of Gill (1980, 1981) and Sellke (1988). The crucial point here is that calendar 
time and time since last renewal both need to be taken into account, so the 
straightforward martingale approach displayed by Andersen et al. (1993) is 
not available. Pena et al. also studied robustness to deviations from the 
assumption of independent gap times. 

As noted by Aalen & Huscbyc (1991) in their attractive non-technical 
discussion of observation patterns, observation does however often start be- 
tween renewals. (In the example of Keiding et al. (1998), auto insurance 
claims were considered in a fixed calendar period). As long as observation 
starts at a stopping time, inference is still valid, so by starting observation 
at the first renewal in the interval we can essentially refer back to Pena et 
al. (2001). A more formal argument could be based on the concept of the 
Aalen filter, see Andersen et al. (1993, p. 164). The resulting product-hmit 
estimators will not be fully efficient, since the information in the backward 
recurrence time (types 3 and 4) is ignored. It is important to realize that the 
validity of this way of reducing the data depends critically on the indepen- 
dence assumptions of the model. Keiding et al. (1998), cf. Keiding (2002) 
for details, used this fact to base a goodness-of-fit test on a comparison of 
the full nonparametric maximum likelihood estimator with the product-limit 
estimator. 

Similar terms appear in another model, called the Laslett line segment 
problem (Laslett, 1982). Suppose one has a stationary Poisson process, with 
intensity of points on the real line. We think of the real line as a calendar 
time axis, and the points of the Poisson process will be called pseudo renewal 
times or birth times of some population of individuals. Suppose the individ- 
uals have independent and identically distributed lifetimes, each one starting 
at the corresponding birth time. The corresponding calendar time of the end 
of each lifetime can of course be called a death time. Now suppose that all 
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we can observe are the intersections of individuals' lifetimes (thought of as 
time segments on the time axis) with an observational window [ti,t2]- In 
particular, we do not know the current age of an individual who is observed 
alive at time ti. Again we have exactly the same four kinds of observations: 

1. Complete proper lifetimes corresponding to births within [ti, for which 
death occurred before time ^2- 

2. Censored proper lifetimes corresponding to births within [^1,^2] for which 
death occurred after time ^2- 

3. Complete residual lifetimes corresponding to births which occurred at an 
unknown moment before time ti, and for which death occurred after ti and 
before time ^2- 

4. Censored residual lifetimes corresponding to births which occurred at an 
unknown moment before time ti, for which death occurred after time t2, and 
which are therefore censored at time ^2- 

The number of at least partially observed lifetimes (proper or residual) is 
random, and Poisson distributed with mean equal to the intensity of the 
underlying Poisson process of birth times, times the factor 



This provides a fifth, "Poisson" , factor in the nonparamctric likelihood func- 
tion for parameters /j, and F, based on all the available data. Maximizing over 
II and F, the mean of the Poisson distribution is estimated by the observed 
number of partially observed lifetimes. Thus wc find that the profile likeli- 
hood for F, and the marginal likelihood for F based only on contributions 
1.-4., arc proportional to one another. 

Nonparamctric maximum likelihood estimation of F was studied by Wij- 
ers (1995) and van der Laan (1996), cf. van der Laan & Gill (1999). Some 
of their results, and the calculations leading to this likelihood, were surveyed 
by Gill (1994, pp. 190 ff.). The nonparamctric maximum likelihood estima- 
tor is consistent; whether or not it converges in distribution as fi tends to 
infinity is unknown, the model has a singularity coming from the vanishing 
probabihty density of complete lifetimes just larger than the length of the 
observation window corresponding to births just before the start of the ob- 
servation window and deaths just after its end. Van der Laan showed that 
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a mild reduction of the data by grouping or binning leads to a much bet- 
ter behaved nonparametric maximum likelihood estimator. If the amount 
of binning decreases at an appropriate rate as n increases, this leads to an 
asymptotically efficient estimator of F. This procedure can be thought of as 
regularization, a procedure often needed in nonparametric inverse statistical 
problems, where maximum likelihood can be too greedy. 

Both "unregularized" and regularized estimators are easy to compute 
with the EM algorithm; and the speed of the algorithm is not so painfully 
slow as in other inverse problems, since this is still a problem where "root n" 
rate estimation is possible. 

The problem allows, just as we have seen in earlier sections, all the same 
inefficient but rapidly computable product-limit type estimators based on 
various marginal likelihoods. Moreover since the direction of time is basically 
irrelevant to the model, one can also look at the process "backwards" , leading 
to another plethora of inefficient but easy estimators. One can even combine 
in a formal way the censored survival data from a forward and a backward 
time point of view, which comes down to counting all uncensored observations 
twice, all singly censored once, and discarding all doubly censored data. (This 
idea was essentially suggested much earlier by R.C. Palmer and D.R. Cox, 
cf. Palmer (1948)). The attractive feature of this estimator is again the ease 
of computation, the fact that it only discards the doubly censored data, 
and its symmetry under reversing time. The asymptotic distribution theory 
of this estimator is of course not standard, but using the nonparametric 
delta method one can fairly easily give formulas for asymptotic variances and 
covariances. In practice one could easily and correctly use the nonparametric 
bootstrap, resampling from the partially observed lifetimes, where again a 
resampled complete lifetime is entered twice into the estimate. 

The Laslett line segment problem has rather important extensions to ob- 
servation of line segments (e.g., cracks in a rock surface) observed through an 
observational window in the plane. Under the assumption of a homogenous 
Poisson line segment process one can write down nonparametric likelihoods, 
maximize them with the EM algorithm; it seems that regularization may 
well be necessary to get optimal "root n" behaviour but in principle it is 
clear how this might be done. Again, we have the same plethora of ineffi- 
cient but easy product-limit type estimators. Van Zwet (2004) studied the 
behaviour of such estimators when the line segment process is not Poisson, 
but merely stationary. The idea is to use the Poisson process likelihood as 
a quasi likelihood, i.e., as a basis for generating estimating equations, which 
will be unbiased but not efficient, just as in parametric quasi-likelihood. Van 
Zwet shows that this procedure works fine. Coming full circle, one can apply 
these ideas to the renewal process we first described in this section, and the 
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other models described in earlier sections. All of them generate stationary 
line segment processes observed through a finite time window on the line. 
Thus the nonparametric quasi-likelihood approach can be used there too. 
Since in the renewal process case we are ignoring the fact that the intensity 
of the point process of births equals the inverse mean life-time, we do not 
get full efficiency. So it is disputable whether it is worth using an inefficient 
ad-hoc estimator which is difficult to compute when we have the options of 
Soon and Woodroofe's fully efficient (but hard to compute) full nonparamet- 
ric maximum likelihood estimator, and the many inefficient but easy and 
robust product-limit type estimators of this paper. 
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